Office Document Search by Semantic Relationship Approach

Authors

  • Somchai Chatvichienchai
  • Katsumi Tanaka
Abstract

Office applications have become a major pillar of today’s organizations, since they are used to edit a vast number of digital documents. Finding office documents that fit users’ needs across disparate data sources is an essential task in conducting an organization’s activities. Traditional search tools that rely solely on keyword and phrase matching between the query and the search index tend to offer high recall but low precision. As a result, users are faced with too many irrelevant results. To solve this problem, we propose a novel search system that effectively searches documents of some office applications using a search query defined in terms of document type, search terms, and the semantic relationship between the search terms and the documents. We present a technique that collects search terms and their semantic relationships from the documents to generate XML-based search indices. We also present our query optimization algorithm.
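
The idea of a query built from document type, search term, and semantic relationship can be illustrated with a minimal Python sketch. This is not the authors' index format or algorithm: the element names (`doc`, `term`), the attributes (`type`, `rel`, `path`), the sample documents, and the `search` function are all hypothetical, chosen only to show why adding a relationship constraint can raise precision over plain keyword matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML index: every element/attribute name and every sample
# document here is an illustrative assumption, not the paper's schema.
INDEX_XML = """
<index>
  <doc path="orders/po-001.xls" type="purchase-order">
    <term rel="customer">Acme</term>
    <term rel="product">printer</term>
  </doc>
  <doc path="reports/visit-07.doc" type="report">
    <term rel="author">Acme</term>
  </doc>
</index>
"""


def keyword_search(index_root, term):
    """Plain keyword matching: any document containing the term matches."""
    wanted = term.lower()
    return [
        doc.get("path")
        for doc in index_root.iter("doc")
        if any((t.text or "").strip().lower() == wanted for t in doc.iter("term"))
    ]


def search(index_root, doc_type, term, relationship):
    """Match only documents of doc_type where term occurs under relationship."""
    wanted = term.lower()
    hits = []
    for doc in index_root.iter("doc"):
        if doc.get("type") != doc_type:
            continue
        for t in doc.iter("term"):
            if t.get("rel") == relationship and (t.text or "").strip().lower() == wanted:
                hits.append(doc.get("path"))
                break  # one matching term is enough for this document
    return hits


root = ET.fromstring(INDEX_XML)
keyword_only = keyword_search(root, "Acme")
semantic = search(root, "purchase-order", "Acme", "customer")
print(keyword_only)
print(semantic)
```

In this toy index, the keyword-only query returns both documents, while the query that also constrains document type and relationship returns only the purchase order in which "Acme" appears as the customer.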

Similar resources

Search and Navigation in Semantically Integrated Document Collections

The paper presents a novel approach to semantic search and navigation in office-like document collections. The approach is based on a semantic document model that we have developed to enable unique identification, semantic annotation, and semantic linking of document units of office-like documents. In order to semantically annotate document units and to link semantically related document units, ...

Concept-based semantic annotation, indexing and retrieval of office-like document units

We present an ontology-driven approach to semantic annotation, indexing and retrieval of document units. This approach is based on a novel semantic document model (SDM) that we developed so that office-like document units can be uniquely identified, semantically annotated with concepts from annotation ontologies, and linked across document boundaries. In the semantic annotation model that we propo...

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcomings in capturing the semantic concepts of text motivated researchers to use...

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

As the Web grows, establishing a general framework for data mining and for retrieving structured data from the Web poses many challenges. Creating an ontology is a step towards solving this problem. The ontology identifies the main entities and concepts of the data in data mining. In this paper, we propose a method for applying "meaning" in the search system, but the problem ...

Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics

This paper discusses the future of World Wide Web development, called the Semantic Web. Undoubtedly, the Web is one of the most important services on the Internet and has had the greatest impact on the spread of the Internet in human societies. Internet penetration has been an effective factor in the growth of the volume of information on the Web. The massive growth of informat...

Publication date: 2011